Computing Discriminating and Generic Words

Authors

  • Gregory Kucherov
  • Yakov Nekrich
  • Tatiana A. Starikovskaya
Abstract

We study the following three problems of computing generic or discriminating words for a given collection of documents. Given a pattern P and a threshold d, we want to report (i) all longest extensions of P that occur in at least d documents, (ii) all shortest extensions of P that occur in fewer than d documents, and (iii) all shortest extensions of P that occur only in d selected documents. For these problems, we propose efficient algorithms based on suffix trees and advanced data structure techniques. For problem (i), we propose an optimal solution with constant running time per output word.
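To make problems (i) and (ii) concrete, here is a minimal brute-force sketch in Python. It is not the suffix-tree method the abstract refers to, and it rests on assumptions the abstract does not spell out: an "extension" of P is taken to be a substring of some document that has P as a prefix, "longest" is read as "cannot be extended by one more character without dropping below d documents", and "shortest" is read as "no shorter extension of P already occurs in fewer than d documents". All function names are hypothetical. Problem (iii) is left out because the abstract does not say how the d selected documents are specified.

```python
from typing import List, Set


def doc_frequency(word: str, docs: List[str]) -> int:
    """Number of documents that contain `word` as a substring."""
    return sum(1 for doc in docs if word in doc)


def right_extensions(pattern: str, docs: List[str]) -> Set[str]:
    """All substrings of the documents that start with `pattern`
    (assumed meaning of 'extension'; includes `pattern` itself)."""
    found: Set[str] = set()
    for doc in docs:
        start = doc.find(pattern)
        while start != -1:
            for end in range(start + len(pattern), len(doc) + 1):
                found.add(doc[start:end])
            start = doc.find(pattern, start + 1)
    return found


def longest_generic(pattern: str, docs: List[str], d: int) -> List[str]:
    """Problem (i), brute force: extensions occurring in at least d documents
    that have no one-character-longer extension with the same property."""
    generic = {w for w in right_extensions(pattern, docs)
               if doc_frequency(w, docs) >= d}
    return sorted(
        w for w in generic
        if not any(len(v) == len(w) + 1 and v.startswith(w) for v in generic)
    )


def shortest_discriminating(pattern: str, docs: List[str], d: int) -> List[str]:
    """Problem (ii), brute force: extensions occurring in fewer than d documents
    whose one-character-shorter prefix (if it still extends `pattern`)
    occurs in at least d documents."""
    result = []
    for w in right_extensions(pattern, docs):
        if doc_frequency(w, docs) >= d:
            continue
        if w == pattern or doc_frequency(w[:-1], docs) >= d:
            result.append(w)
    return sorted(result)


if __name__ == "__main__":
    docs = ["abcab", "abd", "xabc"]
    print(longest_generic("ab", docs, 2))
    print(shortest_discriminating("ab", docs, 2))
```

The sketch enumerates every candidate substring and recounts document frequencies each time, so it is only suitable for tiny inputs; its purpose is to pin down one reading of the problem statements, not to match the running times claimed in the abstract.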

Similar resources

Use of Structure Codes (Counts) for Computing Topological Indices of Carbon Nanotubes: Sadhana (Sd) Index of Phenylenes and its Hexagonal Squeezes

Structural codes vis-a-vis structural counts, like polynomials of a molecular graph, are important in computing graph-theoretical descriptors which are commonly known as topological indices. These indices are most important for characterizing carbon nanotubes (CNTs). In this paper we have computed Sadhana index (Sd) for phenylenes and their hexagonal squeezes using structural codes (counts). Sa...

Improving Mobile Grid Performance Using Fuzzy Job Replica Count Determiner

Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common computational platform. Mobile computing is a generic term for the use of movable, handheld devices with wireless communication to process data. Mobile computing focuses on providing access to data, information, services and communications anywhere an...

ECG Pattern Classification Based on Generic Feature Extraction

In this paper, we propose a new ECG pattern classification model based on a generic feature extraction method. The proposed classifier is applied to indicate supraventricular arrhythmia in order to verify the performance of the proposed approach. A generic approach based on a histogram of the first derivative of the signals is applied for feature extraction. Principal component analysis (PCA) is consider...

Type-2 fuzzy set extension of DEMATEL method combined with perceptual computing for decision making

Most decision-making methods used to evaluate a system or to identify its weak and strong points are based on fuzzy sets and evaluate the criteria with words that are modeled with fuzzy sets. The ambiguity and vagueness of the words and the different perceptions of a word are not considered in these methods. For this reason, the decision making methods that consider the perceptions of decision...

Publication date: 2012